Overview

Dataset statistics

Number of variables: 29
Number of observations: 82580
Missing cells: 3746
Missing cells (%): 0.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 18.3 MiB
Average record size in memory: 232.0 B
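
The derived figures in this block follow from the raw counts. A minimal Python sketch, assuming the missing-cell percentage is taken over variables × observations and that 1 MiB is 2^20 bytes (neither assumption is stated in the report):

```python
# Figures copied from the dataset statistics.
n_variables = 29
n_observations = 82580
missing_cells = 3746
total_size_mib = 18.3

# Assumption: a "cell" is one variable value for one observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.0f} B")
```

Under those assumptions the rounded results agree with the reported 0.2% and 232 B.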

Variable types

Numeric: 12
Categorical: 17

Alerts

Nationality has a high cardinality: 188 distinct values High cardinality
Unnamed: 0 is highly correlated with ID and 3 other fields High correlation
ID is highly correlated with Unnamed: 0 and 3 other fields High correlation
DaysSinceCreation is highly correlated with Unnamed: 0 and 3 other fields High correlation
AverageLeadTime is highly correlated with BookingsCheckedIn and 3 other fields High correlation
BookingsCheckedIn is highly correlated with AverageLeadTime and 5 other fields High correlation
PersonsNights is highly correlated with AverageLeadTime and 5 other fields High correlation
RoomNights is highly correlated with AverageLeadTime and 5 other fields High correlation
DaysSinceLastStay is highly correlated with Unnamed: 0 and 6 other fields High correlation
DaysSinceFirstStay is highly correlated with Unnamed: 0 and 6 other fields High correlation
Total Revenue is highly correlated with AverageLeadTime and 3 other fields High correlation
BookingsCheckedIn is highly correlated with RoomNights High correlation
PersonsNights is highly correlated with RoomNights and 1 other field High correlation
RoomNights is highly correlated with BookingsCheckedIn and 2 other fields High correlation
DaysSinceLastStay is highly correlated with Unnamed: 0 and 3 other fields High correlation
DaysSinceFirstStay is highly correlated with Unnamed: 0 and 3 other fields High correlation
Total Revenue is highly correlated with PersonsNights and 1 other field High correlation
AverageLeadTime is highly correlated with BookingsCheckedIn and 2 other fields High correlation
PersonsNights is highly correlated with AverageLeadTime and 3 other fields High correlation
RoomNights is highly correlated with AverageLeadTime and 3 other fields High correlation
DaysSinceLastStay is highly correlated with Unnamed: 0 and 4 other fields High correlation
DaysSinceFirstStay is highly correlated with Unnamed: 0 and 4 other fields High correlation
Total Revenue is highly correlated with BookingsCheckedIn and 2 other fields High correlation
MarketSegment is highly correlated with DistributionChannel High correlation
DistributionChannel is highly correlated with MarketSegment High correlation
AverageLeadTime is highly correlated with DaysSinceLastStay and 1 other field High correlation
BookingsCanceled is highly correlated with BookingsCheckedIn and 1 other field High correlation
BookingsNoShowed is highly correlated with BookingsCheckedIn and 2 other fields High correlation
BookingsCheckedIn is highly correlated with BookingsCanceled and 4 other fields High correlation
PersonsNights is highly correlated with BookingsNoShowed and 3 other fields High correlation
RoomNights is highly correlated with BookingsCanceled and 4 other fields High correlation
Age has 3746 (4.5%) missing values Missing
BookingsCanceled is highly skewed (γ1 = 57.95186518) Skewed
BookingsCheckedIn is highly skewed (γ1 = 27.07036159) Skewed
Unnamed: 0 is uniformly distributed Uniform
ID is uniformly distributed Uniform
Unnamed: 0 has unique values Unique
ID has unique values Unique
AverageLeadTime has 22157 (26.8%) zeros Zeros
BookingsCanceled has 82462 (99.9%) zeros Zeros
BookingsCheckedIn has 19394 (23.5%) zeros Zeros
PersonsNights has 19396 (23.5%) zeros Zeros
RoomNights has 19394 (23.5%) zeros Zeros
Total Revenue has 19667 (23.8%) zeros Zeros

Reproduction

Analysis started: 2022-09-06 16:39:30.898385
Analysis finished: 2022-09-06 16:40:30.168443
Duration: 59.27 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json
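
The reported duration can be recomputed from the two timestamps. A small sketch using only the Python standard library; the format string is an assumption about how the timestamps are laid out:

```python
from datetime import datetime

# Assumed layout of the timestamps shown in the Reproduction block.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2022-09-06 16:39:30.898385", FMT)
finished = datetime.strptime("2022-09-06 16:40:30.168443", FMT)

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```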

Variables

Unnamed: 0
Real number (ℝ≥0)

HIGH CORRELATION
UNIFORM
UNIQUE

Distinct: 82580
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 41289.5
Minimum: 0
Maximum: 82579
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 645.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 4128.95
Q1: 20644.75
median: 41289.5
Q3: 61934.25
95-th percentile: 78450.05
Maximum: 82579
Range: 82579
Interquartile range (IQR): 41289.5

Descriptive statistics

Standard deviation: 23838.93695
Coefficient of variation (CV): 0.5773607564
Kurtosis: -1.2
Mean: 41289.5
Median Absolute Deviation (MAD): 20645
Skewness: 0
Sum: 3409686910
Variance: 568294915
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 1 | < 0.1%
55080 | 1 | < 0.1%
55058 | 1 | < 0.1%
55057 | 1 | < 0.1%
55056 | 1 | < 0.1%
55055 | 1 | < 0.1%
55054 | 1 | < 0.1%
55053 | 1 | < 0.1%
55052 | 1 | < 0.1%
55051 | 1 | < 0.1%
Other values (82570) | 82570 | > 99.9%

Value | Count | Frequency (%)
0 | 1 | < 0.1%
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%

Value | Count | Frequency (%)
82579 | 1 | < 0.1%
82578 | 1 | < 0.1%
82577 | 1 | < 0.1%
82576 | 1 | < 0.1%
82575 | 1 | < 0.1%
82574 | 1 | < 0.1%
82573 | 1 | < 0.1%
82572 | 1 | < 0.1%
82571 | 1 | < 0.1%
82570 | 1 | < 0.1%
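
Because Unnamed: 0 is unique, strictly increasing and runs from 0 to 82579, its statistics have closed forms. The sketch below assumes the column is exactly the sequence 0, 1, ..., n - 1, that the reported variance is the sample variance (dividing by n - 1), and that quantiles use linear interpolation:

```python
n = 82580  # number of observations; column assumed to hold 0, 1, ..., n - 1

total = n * (n - 1) // 2       # sum of the first n non-negative integers
mean = (n - 1) / 2
variance = n * (n + 1) / 12    # sample variance (ddof=1) of 0..n-1
std = variance ** 0.5

# Linear-interpolation quantiles of an evenly spaced sequence.
p05 = 0.05 * (n - 1)
q1 = 0.25 * (n - 1)
q3 = 0.75 * (n - 1)
iqr = q3 - q1

print(total, mean, variance, round(std, 5), p05, q1, q3, iqr)
```

Under these assumptions the closed forms agree with the reported sum, mean, variance and quartiles, which is consistent with the column being a plain row index.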

ID
Real number (ℝ≥0)

HIGH CORRELATION
UNIFORM
UNIQUE

Distinct: 82580
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 41290.5
Minimum: 1
Maximum: 82580
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 645.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 4129.95
Q1: 20645.75
median: 41290.5
Q3: 61935.25
95-th percentile: 78451.05
Maximum: 82580
Range: 82579
Interquartile range (IQR): 41289.5

Descriptive statistics

Standard deviation: 23838.93695
Coefficient of variation (CV): 0.5773467735
Kurtosis: -1.2
Mean: 41290.5
Median Absolute Deviation (MAD): 20645
Skewness: 0
Sum: 3409769490
Variance: 568294915
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 1 | < 0.1%
55081 | 1 | < 0.1%
55059 | 1 | < 0.1%
55058 | 1 | < 0.1%
55057 | 1 | < 0.1%
55056 | 1 | < 0.1%
55055 | 1 | < 0.1%
55054 | 1 | < 0.1%
55053 | 1 | < 0.1%
55052 | 1 | < 0.1%
Other values (82570) | 82570 | > 99.9%

Value | Count | Frequency (%)
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%
10 | 1 | < 0.1%

Value | Count | Frequency (%)
82580 | 1 | < 0.1%
82579 | 1 | < 0.1%
82578 | 1 | < 0.1%
82577 | 1 | < 0.1%
82576 | 1 | < 0.1%
82575 | 1 | < 0.1%
82574 | 1 | < 0.1%
82573 | 1 | < 0.1%
82572 | 1 | < 0.1%
82571 | 1 | < 0.1%

Nationality
Categorical

HIGH CARDINALITY

Distinct: 188
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 645.3 KiB
FRA: 12307
PRT: 11382
DEU: 10164
GBR: 8610
ESP: 4864
Other values (183): 35253

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 247740
Distinct characters: 26
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 33
Unique (%): < 0.1%

Sample

1st row: PRT
2nd row: PRT
3rd row: DEU
4th row: FRA
5th row: FRA

Common Values

Value | Count | Frequency (%)
FRA | 12307 | 14.9%
PRT | 11382 | 13.8%
DEU | 10164 | 12.3%
GBR | 8610 | 10.4%
ESP | 4864 | 5.9%
USA | 3398 | 4.1%
ITA | 3301 | 4.0%
BEL | 3111 | 3.8%
BRA | 2710 | 3.3%
NLD | 2698 | 3.3%
Other values (178) | 20035 | 24.3%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
fra | 12307 | 14.9%
prt | 11382 | 13.8%
deu | 10164 | 12.3%
gbr | 8610 | 10.4%
esp | 4864 | 5.9%
usa | 3398 | 4.1%
ita | 3301 | 4.0%
bel | 3111 | 3.8%
bra | 2710 | 3.3%
nld | 2698 | 3.3%
Other values (178) | 20035 | 24.3%

Most occurring characters

Value | Count | Frequency (%)
R | 41541 | 16.8%
A | 26519 | 10.7%
E | 22239 | 9.0%
U | 17848 | 7.2%
P | 17495 | 7.1%
T | 16755 | 6.8%
B | 14932 | 6.0%
D | 13879 | 5.6%
F | 13022 | 5.3%
S | 12285 | 5.0%
Other values (16) | 51225 | 20.7%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 247740 | 100.0%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
R | 41541 | 16.8%
A | 26519 | 10.7%
E | 22239 | 9.0%
U | 17848 | 7.2%
P | 17495 | 7.1%
T | 16755 | 6.8%
B | 14932 | 6.0%
D | 13879 | 5.6%
F | 13022 | 5.3%
S | 12285 | 5.0%
Other values (16) | 51225 | 20.7%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 247740 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
R | 41541 | 16.8%
A | 26519 | 10.7%
E | 22239 | 9.0%
U | 17848 | 7.2%
P | 17495 | 7.1%
T | 16755 | 6.8%
B | 14932 | 6.0%
D | 13879 | 5.6%
F | 13022 | 5.3%
S | 12285 | 5.0%
Other values (16) | 51225 | 20.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 247740 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
R | 41541 | 16.8%
A | 26519 | 10.7%
E | 22239 | 9.0%
U | 17848 | 7.2%
P | 17495 | 7.1%
T | 16755 | 6.8%
B | 14932 | 6.0%
D | 13879 | 5.6%
F | 13022 | 5.3%
S | 12285 | 5.0%
Other values (16) | 51225 | 20.7%
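
One common response to a high-cardinality alert such as the one on Nationality is to keep only the frequent categories and lump the rest together. The sketch below is illustrative only: the 5% threshold and the `lump_rare` helper are hypothetical, and the counts are the ten most common values listed for Nationality.

```python
# Counts of the ten most common Nationality values, copied from the report.
top10 = {
    "FRA": 12307, "PRT": 11382, "DEU": 10164, "GBR": 8610, "ESP": 4864,
    "USA": 3398, "ITA": 3301, "BEL": 3111, "BRA": 2710, "NLD": 2698,
}
n_observations = 82580
MIN_SHARE = 0.05  # hypothetical threshold, chosen for illustration


def lump_rare(counts, total, min_share):
    """Keep categories with at least min_share of total; pool the rest."""
    kept = {label: c for label, c in counts.items() if c / total >= min_share}
    other = total - sum(kept.values())
    return kept, other


kept, other = lump_rare(top10, n_observations, MIN_SHARE)
print(sorted(kept), other)
```

With this threshold the five retained codes are the same five shown in the Nationality summary, and the pooled remainder equals its "Other values (183)" count.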

Age
Real number (ℝ)

MISSING

Distinct: 105
Distinct (%): 0.1%
Missing: 3746
Missing (%): 4.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 45.46855418
Minimum: -11
Maximum: 122
Zeros: 40
Zeros (%): < 0.1%
Negative: 17
Negative (%): < 0.1%
Memory size: 645.3 KiB

Quantile statistics

Minimum: -11
5-th percentile: 17
Q1: 34
median: 46
Q3: 57
95-th percentile: 72
Maximum: 122
Range: 133
Interquartile range (IQR): 23

Descriptive statistics

Standard deviation: 16.52627567
Coefficient of variation (CV): 0.3634660475
Kurtosis: -0.2865254005
Mean: 45.46855418
Median Absolute Deviation (MAD): 12
Skewness: -0.156066243
Sum: 3584468
Variance: 273.1177875
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
50 | 2015 | 2.4%
51 | 2013 | 2.4%
54 | 1965 | 2.4%
53 | 1913 | 2.3%
49 | 1871 | 2.3%
52 | 1867 | 2.3%
48 | 1858 | 2.2%
55 | 1824 | 2.2%
47 | 1820 | 2.2%
46 | 1717 | 2.1%
Other values (95) | 59971 | 72.6%
(Missing) | 3746 | 4.5%

Value | Count | Frequency (%)
-11 | 2 | < 0.1%
-10 | 4 | < 0.1%
-9 | 2 | < 0.1%
-7 | 3 | < 0.1%
-6 | 3 | < 0.1%
-1 | 3 | < 0.1%
0 | 40 | < 0.1%
1 | 105 | 0.1%
2 | 122 | 0.1%
3 | 120 | 0.1%

Value | Count | Frequency (%)
122 | 1 | < 0.1%
114 | 2 | < 0.1%
113 | 3 | < 0.1%
110 | 1 | < 0.1%
109 | 1 | < 0.1%
96 | 1 | < 0.1%
92 | 3 | < 0.1%
91 | 1 | < 0.1%
90 | 2 | < 0.1%
89 | 5 | < 0.1%
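
The extreme values of Age include entries that cannot be real ages. A minimal sketch of how they could be tallied from the lowest and highest values listed for Age; the upper cut-off of 100 is a hypothetical choice, not something the report defines:

```python
# (value, count) pairs for the ten lowest and ten highest Age values.
lowest = [(-11, 2), (-10, 4), (-9, 2), (-7, 3), (-6, 3),
          (-1, 3), (0, 40), (1, 105), (2, 122), (3, 120)]
highest = [(122, 1), (114, 2), (113, 3), (110, 1), (109, 1),
           (96, 1), (92, 3), (91, 1), (90, 2), (89, 5)]

MAX_PLAUSIBLE_AGE = 100  # hypothetical cut-off for illustration

negative = sum(count for value, count in lowest if value < 0)
too_old = sum(count for value, count in highest if value > MAX_PLAUSIBLE_AGE)

print(f"negative ages: {negative}")
print(f"ages above {MAX_PLAUSIBLE_AGE}: {too_old}")
```

The negative tally matches the "Negative" count reported for Age. Because the tenth-highest value (89) is below the cut-off, the ten highest values are enough to count every age above it.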

DaysSinceCreation
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1083
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 459.1381569
Minimum: 12
Maximum: 1095
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 645.3 KiB

Quantile statistics

Minimum: 12
5-th percentile: 60
Q1: 183
median: 406
Q3: 728
95-th percentile: 998
Maximum: 1095
Range: 1083
Interquartile range (IQR): 545

Descriptive statistics

Standard deviation: 311.3092951
Coefficient of variation (CV): 0.6780296745
Kurtosis: -1.152943924
Mean: 459.1381569
Median Absolute Deviation (MAD): 249
Skewness: 0.3935969279
Sum: 37915629
Variance: 96913.47721
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
212 | 298 | 0.4%
232 | 247 | 0.3%
22 | 233 | 0.3%
281 | 227 | 0.3%
101 | 225 | 0.3%
195 | 220 | 0.3%
78 | 211 | 0.3%
217 | 211 | 0.3%
206 | 207 | 0.3%
85 | 206 | 0.2%
Other values (1073) | 80295 | 97.2%

Value | Count | Frequency (%)
12 | 36 | < 0.1%
13 | 50 | 0.1%
14 | 95 | 0.1%
15 | 124 | 0.2%
16 | 42 | 0.1%
17 | 30 | < 0.1%
18 | 39 | < 0.1%
19 | 26 | < 0.1%
20 | 50 | 0.1%
21 | 82 | 0.1%

Value | Count | Frequency (%)
1095 | 70 | 0.1%
1094 | 90 | 0.1%
1093 | 103 | 0.1%
1092 | 16 | < 0.1%
1091 | 99 | 0.1%
1090 | 21 | < 0.1%
1089 | 10 | < 0.1%
1088 | 15 | < 0.1%
1087 | 5 | < 0.1%
1086 | 20 | < 0.1%

AverageLeadTime
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 418
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 66.55720513
Minimum: -1
Maximum: 588
Zeros: 22157
Zeros (%): 26.8%
Negative: 10
Negative (%): < 0.1%
Memory size: 645.3 KiB

Quantile statistics

Minimum: -1
5-th percentile: 0
Q1: 0
median: 30
Q3: 104
95-th percentile: 241
Maximum: 588
Range: 589
Interquartile range (IQR): 104

Descriptive statistics

Standard deviation: 87.92899502
Coefficient of variation (CV): 1.321104076
Kurtosis: 4.464738599
Mean: 66.55720513
Median Absolute Deviation (MAD): 30
Skewness: 1.906882747
Sum: 5496294
Variance: 7731.508165
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 22157 | 26.8%
1 | 1693 | 2.1%
2 | 1051 | 1.3%
6 | 1040 | 1.3%
5 | 1014 | 1.2%
4 | 1000 | 1.2%
3 | 965 | 1.2%
7 | 953 | 1.2%
8 | 903 | 1.1%
9 | 670 | 0.8%
Other values (408) | 51134 | 61.9%

Value | Count | Frequency (%)
-1 | 10 | < 0.1%
0 | 22157 | 26.8%
1 | 1693 | 2.1%
2 | 1051 | 1.3%
3 | 965 | 1.2%
4 | 1000 | 1.2%
5 | 1014 | 1.2%
6 | 1040 | 1.3%
7 | 953 | 1.2%
8 | 903 | 1.1%

Value | Count | Frequency (%)
588 | 19 | < 0.1%
574 | 10 | < 0.1%
549 | 22 | < 0.1%
546 | 10 | < 0.1%
543 | 2 | < 0.1%
542 | 5 | < 0.1%
541 | 5 | < 0.1%
535 | 21 | < 0.1%
533 | 1 | < 0.1%
521 | 11 | < 0.1%
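
Several of the AverageLeadTime figures are derived from others in the same block. A short sketch of those relationships, assuming the coefficient of variation is the standard deviation divided by the mean and the variance is the square of the standard deviation:

```python
# Values copied from the AverageLeadTime statistics.
minimum, maximum = -1, 588
q1, q3 = 0, 104
mean = 66.55720513
std = 87.92899502

value_range = maximum - minimum
iqr = q3 - q1
cv = std / mean          # coefficient of variation (assumed definition)
variance = std ** 2

print(value_range, iqr, round(cv, 6), round(variance, 3))
```

A CV above 1 means the spread exceeds the mean, which fits a column where more than a quarter of the values are zero and the maximum is 588.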

BookingsCanceled
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.002046500363
Minimum: 0
Maximum: 9
Zeros: 82462
Zeros (%): 99.9%
Negative: 0
Negative (%): 0.0%
Memory size: 645.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 9
Range: 9
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.06717662276
Coefficient of variation (CV): 32.82512135
Kurtosis: 5207.131053
Mean: 0.002046500363
Median Absolute Deviation (MAD): 0
Skewness: 57.95186518
Sum: 169
Variance: 0.004512698645
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value | Count | Frequency (%)
0 | 82462 | 99.9%
1 | 92 | 0.1%
2 | 12 | < 0.1%
3 | 8 | < 0.1%
4 | 5 | < 0.1%
9 | 1 | < 0.1%

Value | Count | Frequency (%)
0 | 82462 | 99.9%
1 | 92 | 0.1%
2 | 12 | < 0.1%
3 | 8 | < 0.1%
4 | 5 | < 0.1%
9 | 1 | < 0.1%

Value | Count | Frequency (%)
9 | 1 | < 0.1%
4 | 5 | < 0.1%
3 | 8 | < 0.1%
2 | 12 | < 0.1%
1 | 92 | 0.1%
0 | 82462 | 99.9%
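
BookingsCanceled has only six distinct values, so its moments can be recomputed from the frequency table alone. The sketch below assumes the reported skewness γ1 is the bias-adjusted sample skewness (the estimator pandas uses in `Series.skew`) and that the variance is the sample variance:

```python
from math import sqrt

# value -> count, from the BookingsCanceled frequency table
freq = {0: 82462, 1: 92, 2: 12, 3: 8, 4: 5, 9: 1}

n = sum(freq.values())
mean = sum(v * c for v, c in freq.items()) / n

# Central moments computed with the population (1/n) convention.
m2 = sum(c * (v - mean) ** 2 for v, c in freq.items()) / n
m3 = sum(c * (v - mean) ** 3 for v, c in freq.items()) / n

g1 = m3 / m2 ** 1.5                      # biased moment skewness
G1 = g1 * sqrt(n * (n - 1)) / (n - 2)    # bias-adjusted sample skewness
sample_variance = m2 * n / (n - 1)

print(n, mean, sample_variance, G1)
```

The extreme skewness comes almost entirely from the 118 non-zero rows: with 99.9% of values at zero, a single guest with nine cancellations sits far out in the right tail.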

BookingsNoShowed
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 645.3 KiB
0: 82536
1: 36
2: 7
3: 1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 82580
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 82536 | 99.9%
1 | 36 | < 0.1%
2 | 7 | < 0.1%
3 | 1 | < 0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
082536
99.9%
136
 
< 0.1%
27
 
< 0.1%
31
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
082536
99.9%
136
 
< 0.1%
27
 
< 0.1%
31
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number82580
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
082536
99.9%
136
 
< 0.1%
27
 
< 0.1%
31
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common82580
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
082536
99.9%
136
 
< 0.1%
27
 
< 0.1%
31
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII82580
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
082536
99.9%
136
 
< 0.1%
27
 
< 0.1%
31
 
< 0.1%

BookingsCheckedIn
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 29
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.79840155
Minimum: 0
Maximum: 66
Zeros: 19394
Zeros (%): 23.5%
Negative: 0
Negative (%): 0.0%
Memory size: 645.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 1
Q3: 1
95-th percentile: 1
Maximum: 66
Range: 66
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.6968798281
Coefficient of variation (CV): 0.8728437816
Kurtosis: 1847.105958
Mean: 0.79840155
Median Absolute Deviation (MAD): 0
Skewness: 27.07036159
Sum: 65932
Variance: 0.4856414949
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=29)
Value | Count | Frequency (%)
1 | 61737 | 74.8%
0 | 19394 | 23.5%
2 | 1141 | 1.4%
3 | 132 | 0.2%
4 | 59 | 0.1%
5 | 20 | < 0.1%
6 | 20 | < 0.1%
7 | 16 | < 0.1%
8 | 10 | < 0.1%
9 | 9 | < 0.1%
Other values (19) | 42 | 0.1%

Value | Count | Frequency (%)
0 | 19394 | 23.5%
1 | 61737 | 74.8%
2 | 1141 | 1.4%
3 | 132 | 0.2%
4 | 59 | 0.1%
5 | 20 | < 0.1%
6 | 20 | < 0.1%
7 | 16 | < 0.1%
8 | 10 | < 0.1%
9 | 9 | < 0.1%

Value | Count | Frequency (%)
66 | 1 | < 0.1%
57 | 1 | < 0.1%
40 | 1 | < 0.1%
34 | 1 | < 0.1%
29 | 3 | < 0.1%
26 | 1 | < 0.1%
25 | 1 | < 0.1%
24 | 1 | < 0.1%
23 | 2 | < 0.1%
20 | 1 | < 0.1%

PersonsNights
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 56
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.667958343
Minimum: 0
Maximum: 116
Zeros: 19396
Zeros (%): 23.5%
Negative: 0
Negative (%): 0.0%
Memory size: 645.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 4
Q3: 7
95-th percentile: 12
Maximum: 116
Range: 116
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 4.562507174
Coefficient of variation (CV): 0.9774095737
Kurtosis: 12.60176786
Mean: 4.667958343
Median Absolute Deviation (MAD): 3
Skewness: 1.935882185
Sum: 385480
Variance: 20.81647172
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 19396 | 23.5%
6 | 13639 | 16.5%
4 | 10366 | 12.6%
2 | 9512 | 11.5%
8 | 8199 | 9.9%
1 | 4109 | 5.0%
3 | 3996 | 4.8%
10 | 3434 | 4.2%
12 | 3055 | 3.7%
9 | 1830 | 2.2%
Other values (46) | 5044 | 6.1%

Value | Count | Frequency (%)
0 | 19396 | 23.5%
1 | 4109 | 5.0%
2 | 9512 | 11.5%
3 | 3996 | 4.8%
4 | 10366 | 12.6%
5 | 898 | 1.1%
6 | 13639 | 16.5%
7 | 221 | 0.3%
8 | 8199 | 9.9%
9 | 1830 | 2.2%

Value | Count | Frequency (%)
116 | 1 | < 0.1%
78 | 1 | < 0.1%
75 | 1 | < 0.1%
73 | 1 | < 0.1%
68 | 2 | < 0.1%
62 | 1 | < 0.1%
59 | 1 | < 0.1%
56 | 1 | < 0.1%
52 | 1 | < 0.1%
51 | 1 | < 0.1%

RoomNights
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 48
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.36941148
Minimum: 0
Maximum: 185
Zeros: 19394
Zeros (%): 23.5%
Negative: 0
Negative (%): 0.0%
Memory size: 645.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 2
Q3: 4
95-th percentile: 6
Maximum: 185
Range: 185
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 2.2817575
Coefficient of variation (CV): 0.9630060121
Kurtosis: 655.3970849
Mean: 2.36941148
Median Absolute Deviation (MAD): 1
Skewness: 11.30982346
Sum: 195666
Variance: 5.20641729
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=48)
Value | Count | Frequency (%)
0 | 19394 | 23.5%
3 | 17084 | 20.7%
2 | 14031 | 17.0%
1 | 11231 | 13.6%
4 | 11083 | 13.4%
5 | 4997 | 6.1%
7 | 1915 | 2.3%
6 | 1830 | 2.2%
8 | 368 | 0.4%
9 | 195 | 0.2%
Other values (38) | 452 | 0.5%

Value | Count | Frequency (%)
0 | 19394 | 23.5%
1 | 11231 | 13.6%
2 | 14031 | 17.0%
3 | 17084 | 20.7%
4 | 11083 | 13.4%
5 | 4997 | 6.1%
6 | 1830 | 2.2%
7 | 1915 | 2.3%
8 | 368 | 0.4%
9 | 195 | 0.2%

Value | Count | Frequency (%)
185 | 1 | < 0.1%
116 | 1 | < 0.1%
95 | 1 | < 0.1%
88 | 1 | < 0.1%
66 | 1 | < 0.1%
59 | 1 | < 0.1%
51 | 1 | < 0.1%
49 | 1 | < 0.1%
42 | 1 | < 0.1%
40 | 1 | < 0.1%

DaysSinceLastStay
Real number (ℝ)

HIGH CORRELATION

Distinct: 1100
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 405.9354565
Minimum: -1
Maximum: 1104
Zeros: 0
Zeros (%): 0.0%
Negative: 19394
Negative (%): 23.5%
Memory size: 645.3 KiB

Quantile statistics

Minimum: -1
5-th percentile: -1
Q1: 42
median: 378
Q3: 698
95-th percentile: 982
Maximum: 1104
Range: 1105
Interquartile range (IQR): 656

Descriptive statistics

Standard deviation: 346.5023411
Coefficient of variation (CV): 0.8535897407
Kurtosis: -1.289002372
Mean: 405.9354565
Median Absolute Deviation (MAD): 330
Skewness: 0.2920894444
Sum: 33522150
Variance: 120063.8724
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
-1 | 19394 | 23.5%
920 | 203 | 0.2%
472 | 196 | 0.2%
477 | 165 | 0.2%
938 | 158 | 0.2%
97 | 156 | 0.2%
192 | 144 | 0.2%
217 | 144 | 0.2%
442 | 126 | 0.2%
206 | 126 | 0.2%
Other values (1090) | 61768 | 74.8%

Value | Count | Frequency (%)
-1 | 19394 | 23.5%
2 | 1 | < 0.1%
4 | 2 | < 0.1%
5 | 4 | < 0.1%
8 | 1 | < 0.1%
9 | 3 | < 0.1%
10 | 1 | < 0.1%
11 | 3 | < 0.1%
12 | 6 | < 0.1%
13 | 16 | < 0.1%

Value | Count | Frequency (%)
1104 | 3 | < 0.1%
1102 | 1 | < 0.1%
1101 | 2 | < 0.1%
1100 | 12 | < 0.1%
1099 | 12 | < 0.1%
1098 | 36 | < 0.1%
1097 | 44 | 0.1%
1096 | 37 | < 0.1%
1095 | 14 | < 0.1%
1094 | 55 | 0.1%
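
The value -1 accounts for 23.5% of DaysSinceLastStay, and its count (19394) equals the number of zeros in BookingsCheckedIn, so it may be a placeholder for guests with no recorded stay. That reading is an assumption, not something the report states. Under it, the sketch below shows how far the placeholder pulls the mean, using only the reported sum and counts:

```python
# Figures copied from the DaysSinceLastStay statistics.
n_observations = 82580
n_sentinel = 19394        # rows where the value is -1
reported_sum = 33522150   # "Sum" from the descriptive statistics
SENTINEL = -1             # assumed "no stay recorded" marker

valid_n = n_observations - n_sentinel
valid_sum = reported_sum - n_sentinel * SENTINEL  # remove the -1 contributions

mean_with_sentinel = reported_sum / n_observations
mean_without_sentinel = valid_sum / valid_n

print(f"mean including -1 rows: {mean_with_sentinel:.2f}")
print(f"mean excluding -1 rows: {mean_without_sentinel:.2f}")
```

If the assumption holds, treating -1 as missing rather than as a number of days would give a more representative average for guests who have actually stayed.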

DaysSinceFirstStay
Real number (ℝ)

HIGH CORRELATION

Distinct: 1096
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 408.2450714
Minimum: -1
Maximum: 1186
Zeros: 0
Zeros (%): 0.0%
Negative: 19394
Negative (%): 23.5%
Memory size: 645.3 KiB

Quantile statistics

Minimum: -1
5-th percentile: -1
Q1: 44
median: 388
Q3: 705
95-th percentile: 983
Maximum: 1186
Range: 1187
Interquartile range (IQR): 661

Descriptive statistics

Standard deviation: 347.2471272
Coefficient of variation (CV): 0.850584983
Kurtosis: -1.295653749
Mean: 408.2450714
Median Absolute Deviation (MAD): 333
Skewness: 0.2832263141
Sum: 33712878
Variance: 120580.5673
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
-1 | 19394 | 23.5%
920 | 203 | 0.2%
472 | 203 | 0.2%
477 | 161 | 0.2%
938 | 157 | 0.2%
97 | 149 | 0.2%
217 | 140 | 0.2%
192 | 139 | 0.2%
442 | 125 | 0.2%
206 | 124 | 0.2%
Other values (1086) | 61785 | 74.8%

Value | Count | Frequency (%)
-1 | 19394 | 23.5%
13 | 7 | < 0.1%
14 | 26 | < 0.1%
15 | 15 | < 0.1%
16 | 33 | < 0.1%
17 | 41 | < 0.1%
18 | 36 | < 0.1%
19 | 24 | < 0.1%
20 | 20 | < 0.1%
21 | 42 | 0.1%

Value | Count | Frequency (%)
1186 | 1 | < 0.1%
1117 | 1 | < 0.1%
1116 | 1 | < 0.1%
1111 | 1 | < 0.1%
1104 | 3 | < 0.1%
1102 | 1 | < 0.1%
1101 | 2 | < 0.1%
1100 | 12 | < 0.1%
1099 | 12 | < 0.1%
1098 | 36 | < 0.1%

DistributionChannel
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 645.3 KiB
Travel Agent/Operator: 67798
Direct: 11709
Corporate: 2565
Electronic Distribution: 508

Length

Max length: 23
Median length: 21
Mean length: 18.51272705
Min length: 6

Characters and Unicode

Total characters: 1528781
Distinct characters: 23
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Corporate
2nd row: Travel Agent/Operator
3rd row: Travel Agent/Operator
4th row: Travel Agent/Operator
5th row: Travel Agent/Operator

Common Values

Value | Count | Frequency (%)
Travel Agent/Operator | 67798 | 82.1%
Direct | 11709 | 14.2%
Corporate | 2565 | 3.1%
Electronic Distribution | 508 | 0.6%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
travel | 67798 | 44.9%
agent/operator | 67798 | 44.9%
direct | 11709 | 7.8%
corporate | 2565 | 1.7%
electronic | 508 | 0.3%
distribution | 508 | 0.3%

Most occurring characters

ValueCountFrequency (%)
r221249
14.5%
e218176
14.3%
t151394
 
9.9%
a138161
 
9.0%
o73944
 
4.8%
p70363
 
4.6%
n68814
 
4.5%
l68306
 
4.5%
68306
 
4.5%
T67798
 
4.4%
Other values (13)382270
25.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter1173993
76.8%
Uppercase Letter218684
 
14.3%
Space Separator68306
 
4.5%
Other Punctuation67798
 
4.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r221249
18.8%
e218176
18.6%
t151394
12.9%
a138161
11.8%
o73944
 
6.3%
p70363
 
6.0%
n68814
 
5.9%
l68306
 
5.8%
g67798
 
5.8%
v67798
 
5.8%
Other values (5)27990
 
2.4%
Uppercase Letter
ValueCountFrequency (%)
T67798
31.0%
O67798
31.0%
A67798
31.0%
D12217
 
5.6%
C2565
 
1.2%
E508
 
0.2%
Space Separator
ValueCountFrequency (%)
68306
100.0%
Other Punctuation
ValueCountFrequency (%)
/67798
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1392677
91.1%
Common136104
 
8.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
r221249
15.9%
e218176
15.7%
t151394
10.9%
a138161
9.9%
o73944
 
5.3%
p70363
 
5.1%
n68814
 
4.9%
l68306
 
4.9%
T67798
 
4.9%
O67798
 
4.9%
Other values (11)246674
17.7%
Common
ValueCountFrequency (%)
68306
50.2%
/67798
49.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII1528781
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r221249
14.5%
e218176
14.3%
t151394
 
9.9%
a138161
 
9.0%
o73944
 
4.8%
p70363
 
4.6%
n68814
 
4.5%
l68306
 
4.5%
68306
 
4.5%
T67798
 
4.4%
Other values (13)382270
25.0%

MarketSegment
Categorical

HIGH CORRELATION

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 645.3 KiB
Other: 47457
Travel Agent/Operator: 11482
Direct: 11278
Groups: 9501
Corporate: 2135
Other values (2): 727

Length

Max length: 21
Median length: 5
Mean length: 7.635408089
Min length: 5

Characters and Unicode

Total characters: 630532
Distinct characters: 25
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Corporate
2nd row: Travel Agent/Operator
3rd row: Travel Agent/Operator
4th row: Travel Agent/Operator
5th row: Travel Agent/Operator

Common Values

Value | Count | Frequency (%)
Other | 47457 | 57.5%
Travel Agent/Operator | 11482 | 13.9%
Direct | 11278 | 13.7%
Groups | 9501 | 11.5%
Corporate | 2135 | 2.6%
Complementary | 484 | 0.6%
Aviation | 243 | 0.3%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
other | 47457 | 50.5%
travel | 11482 | 12.2%
agent/operator | 11482 | 12.2%
direct | 11278 | 12.0%
groups | 9501 | 10.1%
corporate | 2135 | 2.3%
complementary | 484 | 0.5%
aviation | 243 | 0.3%

Most occurring characters

ValueCountFrequency (%)
r107436
17.0%
e96284
15.3%
t84561
13.4%
O58939
9.3%
h47457
 
7.5%
o25980
 
4.1%
a25826
 
4.1%
p23602
 
3.7%
n12209
 
1.9%
l11966
 
1.9%
Other values (15)136272
21.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter502024
79.6%
Uppercase Letter105544
 
16.7%
Space Separator11482
 
1.8%
Other Punctuation11482
 
1.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r107436
21.4%
e96284
19.2%
t84561
16.8%
h47457
9.5%
o25980
 
5.2%
a25826
 
5.1%
p23602
 
4.7%
n12209
 
2.4%
l11966
 
2.4%
i11764
 
2.3%
Other values (7)54939
10.9%
Uppercase Letter
ValueCountFrequency (%)
O58939
55.8%
A11725
 
11.1%
T11482
 
10.9%
D11278
 
10.7%
G9501
 
9.0%
C2619
 
2.5%
Space Separator
ValueCountFrequency (%)
11482
100.0%
Other Punctuation
ValueCountFrequency (%)
/11482
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin607568
96.4%
Common22964
 
3.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
r107436
17.7%
e96284
15.8%
t84561
13.9%
O58939
9.7%
h47457
7.8%
o25980
 
4.3%
a25826
 
4.3%
p23602
 
3.9%
n12209
 
2.0%
l11966
 
2.0%
Other values (13)113308
18.6%
Common
ValueCountFrequency (%)
11482
50.0%
/11482
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII630532
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r107436
17.0%
e96284
15.3%
t84561
13.4%
O58939
9.3%
h47457
 
7.5%
o25980
 
4.1%
a25826
 
4.1%
p23602
 
3.7%
n12209
 
1.9%
l11966
 
1.9%
Other values (15)136272
21.6%

SRHighFloor
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 645.3 KiB
0: 78656
1: 3924

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 82580
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 78656 | 95.2%
1 | 3924 | 4.8%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
078656
95.2%
13924
 
4.8%

Most occurring characters

ValueCountFrequency (%)
078656
95.2%
13924
 
4.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number82580
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
078656
95.2%
13924
 
4.8%

Most occurring scripts

ValueCountFrequency (%)
Common82580
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
078656
95.2%
13924
 
4.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII82580
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
078656
95.2%
13924
 
4.8%

SRLowFloor
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 645.3 KiB
0: 82462
1: 118

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 82580
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 82462 | 99.9%
1 | 118 | 0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
082462
99.9%
1118
 
0.1%

Most occurring characters

ValueCountFrequency (%)
082462
99.9%
1118
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number82580
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
082462
99.9%
1118
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Common82580
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
082462
99.9%
1118
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII82580
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
082462
99.9%
1118
 
0.1%

SRAccessibleRoom
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 645.3 KiB
0: 82559
1: 21

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 82580
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 82559 | > 99.9%
1 | 21 | < 0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
082559
> 99.9%
121
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
082559
> 99.9%
121
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number82580
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
082559
> 99.9%
121
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common82580
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
082559
> 99.9%
121
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII82580
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
082559
> 99.9%
121
 
< 0.1%

SRMediumFloor
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 645.3 KiB
0: 82507
1: 73

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 82580
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 82507 | 99.9%
1 | 73 | 0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
082507
99.9%
173
 
0.1%

Most occurring characters

ValueCountFrequency (%)
082507
99.9%
173
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number82580
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
082507
99.9%
173
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Common82580
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
082507
99.9%
173
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII82580
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
082507
99.9%
173
 
0.1%

SRBathtub
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 645.3 KiB

Value | Count
0 | 82348
1 | 232

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 82580
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 82348 | 99.7%
1 | 232 | 0.3%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
0 | 82348 | 99.7%
1 | 232 | 0.3%

Most occurring characters

Value | Count | Frequency (%)
0 | 82348 | 99.7%
1 | 232 | 0.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 82580 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 82348 | 99.7%
1 | 232 | 0.3%

Most occurring scripts

Value | Count | Frequency (%)
Common | 82580 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 82348 | 99.7%
1 | 232 | 0.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 82580 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 82348 | 99.7%
1 | 232 | 0.3%

SRShower
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 645.3 KiB

Value | Count
0 | 82437
1 | 143

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 82580
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 82437 | 99.8%
1 | 143 | 0.2%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
0 | 82437 | 99.8%
1 | 143 | 0.2%

Most occurring characters

Value | Count | Frequency (%)
0 | 82437 | 99.8%
1 | 143 | 0.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 82580 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 82437 | 99.8%
1 | 143 | 0.2%

Most occurring scripts

Value | Count | Frequency (%)
Common | 82580 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 82437 | 99.8%
1 | 143 | 0.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 82580 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 82437 | 99.8%
1 | 143 | 0.2%

SRCrib
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 645.3 KiB

Value | Count
0 | 81522
1 | 1058

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 82580
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 81522 | 98.7%
1 | 1058 | 1.3%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
0 | 81522 | 98.7%
1 | 1058 | 1.3%

Most occurring characters

Value | Count | Frequency (%)
0 | 81522 | 98.7%
1 | 1058 | 1.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 82580 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 81522 | 98.7%
1 | 1058 | 1.3%

Most occurring scripts

Value | Count | Frequency (%)
Common | 82580 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 81522 | 98.7%
1 | 1058 | 1.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 82580 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 81522 | 98.7%
1 | 1058 | 1.3%

SRKingSizeBed
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 645.3 KiB

Value | Count
0 | 53539
1 | 29041

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 82580
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 53539 | 64.8%
1 | 29041 | 35.2%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
0 | 53539 | 64.8%
1 | 29041 | 35.2%

Most occurring characters

Value | Count | Frequency (%)
0 | 53539 | 64.8%
1 | 29041 | 35.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 82580 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 53539 | 64.8%
1 | 29041 | 35.2%

Most occurring scripts

Value | Count | Frequency (%)
Common | 82580 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 53539 | 64.8%
1 | 29041 | 35.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 82580 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 53539 | 64.8%
1 | 29041 | 35.2%

SRTwinBed
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 645.3 KiB

Value | Count
0 | 70790
1 | 11790

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 82580
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 70790 | 85.7%
1 | 11790 | 14.3%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
0 | 70790 | 85.7%
1 | 11790 | 14.3%

Most occurring characters

Value | Count | Frequency (%)
0 | 70790 | 85.7%
1 | 11790 | 14.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 82580 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 70790 | 85.7%
1 | 11790 | 14.3%

Most occurring scripts

Value | Count | Frequency (%)
Common | 82580 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 70790 | 85.7%
1 | 11790 | 14.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 82580 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 70790 | 85.7%
1 | 11790 | 14.3%

SRNearElevator
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 645.3 KiB

Value | Count
0 | 82555
1 | 25

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 82580
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 82555 | > 99.9%
1 | 25 | < 0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
0 | 82555 | > 99.9%
1 | 25 | < 0.1%

Most occurring characters

Value | Count | Frequency (%)
0 | 82555 | > 99.9%
1 | 25 | < 0.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 82580 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 82555 | > 99.9%
1 | 25 | < 0.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 82580 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 82555 | > 99.9%
1 | 25 | < 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 82580 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 82555 | > 99.9%
1 | 25 | < 0.1%

SRAwayFromElevator
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 645.3 KiB

Value | Count
0 | 82287
1 | 293

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 82580
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 82287 | 99.6%
1 | 293 | 0.4%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
0 | 82287 | 99.6%
1 | 293 | 0.4%

Most occurring characters

Value | Count | Frequency (%)
0 | 82287 | 99.6%
1 | 293 | 0.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 82580 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 82287 | 99.6%
1 | 293 | 0.4%

Most occurring scripts

Value | Count | Frequency (%)
Common | 82580 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 82287 | 99.6%
1 | 293 | 0.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 82580 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 82287 | 99.6%
1 | 293 | 0.4%

SRNoAlcoholInMiniBar
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 645.3 KiB

Value | Count
0 | 82570
1 | 10

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 82580
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 82570 | > 99.9%
1 | 10 | < 0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
0 | 82570 | > 99.9%
1 | 10 | < 0.1%

Most occurring characters

Value | Count | Frequency (%)
0 | 82570 | > 99.9%
1 | 10 | < 0.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 82580 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 82570 | > 99.9%
1 | 10 | < 0.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 82580 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 82570 | > 99.9%
1 | 10 | < 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 82580 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 82570 | > 99.9%
1 | 10 | < 0.1%

SRQuietRoom
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 645.3 KiB

Value | Count
0 | 75308
1 | 7272

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 82580
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 75308 | 91.2%
1 | 7272 | 8.8%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
0 | 75308 | 91.2%
1 | 7272 | 8.8%

Most occurring characters

Value | Count | Frequency (%)
0 | 75308 | 91.2%
1 | 7272 | 8.8%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 82580 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 75308 | 91.2%
1 | 7272 | 8.8%

Most occurring scripts

Value | Count | Frequency (%)
Common | 82580 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 75308 | 91.2%
1 | 7272 | 8.8%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 82580 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 75308 | 91.2%
1 | 7272 | 8.8%
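
The frequency percentages for SRQuietRoom are each count divided by the 82580 observations. The sketch below recomputes them from the two counts reported in this section. The helper name frequency_pct is hypothetical, and rounding to one decimal is an assumption about how the report formats its figures.

```python
def frequency_pct(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100.0 * count / total, 1)

# SRQuietRoom counts as listed in the report.
counts = {"0": 75308, "1": 7272}
total = sum(counts.values())  # number of observations

for value, count in counts.items():
    print(value, count, f"{frequency_pct(count, total)}%")
```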

Total Revenue
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 16637
Distinct (%): 20.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 368.3473957
Minimum: 0
Maximum: 23365
Zeros: 19667
Zeros (%): 23.8%
Negative: 0
Negative (%): 0.0%
Memory size: 645.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 80
Median: 289
Q3: 498.75
95-th percentile: 1084.5
Maximum: 23365
Range: 23365
Interquartile range (IQR): 418.75

Descriptive statistics

Standard deviation: 444.2943523
Coefficient of variation (CV): 1.206182961
Kurtosis: 131.0727867
Mean: 368.3473957
Median Absolute Deviation (MAD): 209.5
Skewness: 5.789490849
Sum: 30418127.94
Variance: 197397.4715
Monotonicity: Not monotonic
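
Several of the Total Revenue statistics are derived from the others: the coefficient of variation is the standard deviation divided by the mean, the interquartile range is Q3 minus Q1, and the variance is the squared standard deviation. A minimal sketch, with the inputs typed in from the report rather than recomputed from the data, checks that the listed figures agree with one another.

```python
# Values copied from the Total Revenue statistics in the report.
mean = 368.3473957
std = 444.2943523
q1, q3 = 80.0, 498.75

cv = std / mean        # coefficient of variation
iqr = q3 - q1          # interquartile range
variance = std ** 2    # variance as the squared standard deviation

print(round(cv, 6), iqr, round(variance, 2))
```

A CV above 1 together with a skewness near 5.8 suggests a strongly right-skewed revenue distribution, in line with a median (289) below the mean.
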
Histogram with fixed size bins (bins=50)

Most frequent values

Value | Count | Frequency (%)
0 | 19667 | 23.8%
234 | 295 | 0.4%
248.8 | 205 | 0.2%
302 | 199 | 0.2%
182 | 189 | 0.2%
359 | 187 | 0.2%
81 | 184 | 0.2%
162 | 172 | 0.2%
344 | 171 | 0.2%
242 | 171 | 0.2%
Other values (16627) | 61140 | 74.0%

Smallest values

Value | Count | Frequency (%)
0 | 19667 | 23.8%
1 | 3 | < 0.1%
2 | 15 | < 0.1%
2.5 | 3 | < 0.1%
3 | 10 | < 0.1%
3.5 | 1 | < 0.1%
4 | 17 | < 0.1%
4.5 | 1 | < 0.1%
5 | 6 | < 0.1%
5.5 | 3 | < 0.1%

Largest values

Value | Count | Frequency (%)
23365 | 1 | < 0.1%
11930.66 | 1 | < 0.1%
11081.15 | 1 | < 0.1%
10982.4 | 1 | < 0.1%
10324.5 | 1 | < 0.1%
9576.8 | 1 | < 0.1%
9188.75 | 1 | < 0.1%
8607 | 1 | < 0.1%
8110.98 | 1 | < 0.1%
8003 | 1 | < 0.1%

Interactions

Correlations

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and 1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
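
The calculation described above can be sketched in plain Python: rank both variables (tied values share the mean of their ranks), then apply the Pearson formula to the ranks. The function names and the toy data are illustrative assumptions, not the report generator's code.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho as the Pearson correlation of the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy data: y = x**3 is monotonic but not linear in x.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
print(spearman(x, y), pearson(x, y))
```

On this toy data the ranks of x and y coincide, so ρ is 1 (up to floating-point rounding), while Pearson's r stays below 1 because the relationship is not linear.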

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and 1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
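
As an illustration of the formula and of the invariance property, the sketch below computes r on made-up data and again after shifting and rescaling each variable separately. The data and the function name are assumptions for illustration. The invariance holds for positive scale factors; multiplying one variable by a negative factor flips the sign of r.

```python
from statistics import mean

def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Made-up data, not taken from the profiled dataset.
x = [1.0, 2.0, 4.0, 7.0]
y = [2.0, 3.0, 5.0, 4.0]

r = pearson(x, y)
# Separate changes of location and (positive) scale leave r unchanged.
r_rescaled = pearson([3 * a + 10 for a in x], [0.5 * b - 2 for b in y])
# A negative scale factor on one variable reverses the sign.
r_flipped = pearson(x, [-b for b in y])

print(r, r_rescaled, r_flipped)
```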

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and 1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
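
The pair-counting definition above corresponds to the variant usually called tau-a. A minimal sketch, assuming no special treatment of ties (tied pairs count as neither concordant nor discordant), could look as follows. The report generator may use a tie-adjusted variant instead, so its values can differ when the data contain ties.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs

# Toy data with a single swapped pair: 5 concordant, 1 discordant pair.
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```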

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
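
A sketch of the bias-corrected estimator, following the correction as commonly stated for Bergsma (2013): the squared phi coefficient and the table dimensions are each adjusted before taking the ratio. The function name is hypothetical, the input is a contingency table given as a list of rows, and the code assumes every row and column total is positive.

```python
from math import sqrt

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table (list of rows)."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_tot[i] * col_tot[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias corrections for phi squared and for the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

independent = [[10, 20], [20, 40]]   # rows are proportional
associated = [[50, 0], [0, 50]]      # one variable determines the other
print(cramers_v_corrected(independent), cramers_v_corrected(associated))
```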

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.

Sample

First rows

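(index) | Unnamed: 0 | ID | Nationality | Age | DaysSinceCreation | AverageLeadTime | BookingsCanceled | BookingsNoShowed | BookingsCheckedIn | PersonsNights | RoomNights | DaysSinceLastStay | DaysSinceFirstStay | DistributionChannel | MarketSegment | SRHighFloor | SRLowFloor | SRAccessibleRoom | SRMediumFloor | SRBathtub | SRShower | SRCrib | SRKingSizeBed | SRTwinBed | SRNearElevator | SRAwayFromElevator | SRNoAlcoholInMiniBar | SRQuietRoom | Total Revenue
0 | 0 | 1 | PRT | 51.0 | 150 | 45 | 1 | 0 | 3 | 8 | 5 | 151 | 1074 | Corporate | Corporate | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 476.3
1 | 1 | 2 | PRT | NaN | 1095 | 61 | 0 | 0 | 1 | 10 | 5 | 1100 | 1100 | Travel Agent/Operator | Travel Agent/Operator | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 333.0
2 | 2 | 3 | DEU | 31.0 | 1095 | 0 | 0 | 0 | 0 | 0 | 0 | -1 | -1 | Travel Agent/Operator | Travel Agent/Operator | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.0
3 | 3 | 4 | FRA | 60.0 | 1095 | 93 | 0 | 0 | 1 | 10 | 5 | 1100 | 1100 | Travel Agent/Operator | Travel Agent/Operator | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 300.0
4 | 4 | 5 | FRA | 51.0 | 1095 | 0 | 0 | 0 | 0 | 0 | 0 | -1 | -1 | Travel Agent/Operator | Travel Agent/Operator | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.0
5 | 5 | 6 | JPN | 54.0 | 1095 | 58 | 0 | 0 | 1 | 4 | 2 | 1097 | 1097 | Travel Agent/Operator | Other | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 254.0
6 | 6 | 7 | JPN | 49.0 | 1095 | 0 | 0 | 0 | 0 | 0 | 0 | -1 | -1 | Travel Agent/Operator | Other | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.0
7 | 7 | 8 | FRA | 32.0 | 1095 | 38 | 0 | 0 | 1 | 10 | 5 | 1100 | 1100 | Travel Agent/Operator | Other | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 629.0
8 | 8 | 9 | FRA | 42.0 | 1095 | 0 | 0 | 0 | 0 | 0 | 0 | -1 | -1 | Travel Agent/Operator | Other | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0.0
9 | 9 | 10 | IRL | 25.0 | 1095 | 96 | 0 | 0 | 1 | 6 | 3 | 1098 | 1098 | Travel Agent/Operator | Travel Agent/Operator | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 243.0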

Last rows

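(index) | Unnamed: 0 | ID | Nationality | Age | DaysSinceCreation | AverageLeadTime | BookingsCanceled | BookingsNoShowed | BookingsCheckedIn | PersonsNights | RoomNights | DaysSinceLastStay | DaysSinceFirstStay | DistributionChannel | MarketSegment | SRHighFloor | SRLowFloor | SRAccessibleRoom | SRMediumFloor | SRBathtub | SRShower | SRCrib | SRKingSizeBed | SRTwinBed | SRNearElevator | SRAwayFromElevator | SRNoAlcoholInMiniBar | SRQuietRoom | Total Revenue
82570 | 82570 | 82571 | ROU | 37.0 | 12 | 64 | 0 | 0 | 1 | 8 | 4 | 16 | 16 | Travel Agent/Operator | Other | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 409.20
82571 | 82571 | 82572 | ROU | 36.0 | 12 | 0 | 0 | 0 | 0 | 0 | 0 | -1 | -1 | Travel Agent/Operator | Other | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00
82572 | 82572 | 82573 | FRA | 32.0 | 12 | 88 | 0 | 0 | 1 | 4 | 2 | 14 | 14 | Travel Agent/Operator | Travel Agent/Operator | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 154.00
82573 | 82573 | 82574 | FRA | 34.0 | 12 | 0 | 0 | 0 | 0 | 0 | 0 | -1 | -1 | Travel Agent/Operator | Travel Agent/Operator | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00
82574 | 82574 | 82575 | PRT | 41.0 | 12 | 7 | 0 | 0 | 1 | 1 | 1 | 13 | 13 | Travel Agent/Operator | Other | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 82.18
82575 | 82575 | 82576 | SWE | 51.0 | 12 | 114 | 0 | 0 | 1 | 6 | 3 | 15 | 15 | Travel Agent/Operator | Other | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 196.08
82576 | 82576 | 82577 | SWE | 50.0 | 12 | 0 | 0 | 0 | 0 | 0 | 0 | -1 | -1 | Travel Agent/Operator | Other | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0.00
82577 | 82577 | 82578 | DEU | 50.0 | 12 | 18 | 0 | 0 | 1 | 3 | 3 | 15 | 15 | Travel Agent/Operator | Other | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 477.00
82578 | 82578 | 82579 | PRT | NaN | 12 | 11 | 0 | 0 | 1 | 3 | 3 | 15 | 15 | Travel Agent/Operator | Other | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 264.00
82579 | 82579 | 82580 | DEU | 17.0 | 12 | 0 | 0 | 0 | 0 | 0 | 0 | -1 | -1 | Travel Agent/Operator | Other | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00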